Web Data Extraction for Business Intelligence: The Lixto Approach

نویسنده

  • Georg Gottlob
چکیده

Knowledge about market developments and competitor activities on the market becomes more and more a critical success factor for enterprises. The World Wide Web provides public domain information which can be retrieved for example from Web sites or online shops. The extraction from semi-structured information sources is mostly done manually and is therefore very time consuming. This paper describes how public information can be extracted automatically fromWeb sites, transformed into structured data formats, and used for data analysis in Business Intelligence systems.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

The Lixto Systems Applications in Business Intelligence and Semantic Web

This paper shows how technologies for Web data extraction, syndication and integration allow for new applications and services in the Business Intelligence and the Semantic Web domain. First, we demonstrate how knowledge about market developments and competitor activities on the market can be extracted dynamically and automatically from semi-structured information sources on the Web. Then, we s...

متن کامل

Lixto – Price Intelligence Suite

IMPACT The Lixto Price Intelligence Suite is a solution that extracts price-comparison information from competitor online web channels and combines it with internal data sources so organizations gain greater visibility into the factors that might influence price. The suite works by navigating and extracting competitive product and pricing information from predefined online data sources before s...

متن کامل

Scalable Web Data Extraction for Online Market Intelligence

Online market intelligence (OMI), in particular competitive intelligence for product pricing, is a very important application area for Web data extraction. However, OMI presents non-trivial challenges to data extraction technology. Sophisticated and highly parameterized navigation and extraction tasks are required. On-the-fly data cleansing is necessary in order two identify identical products ...

متن کامل

Web Information Acquisition with Lixto Suite: A Demonstration∗

We demonstrate the Lixto Suite, a web data extraction and transformation software kit for retrieving and converting information from various sources to various customer devices. With the Lixto Suite, non-technical content managers can rapidly develop applications in the areas of M-Commerce, E-Commerce, content integration and corporate portals.

متن کامل

Intelligent Wrapping from PDF Documents

Wrapping is the process of navigating a data source, semiautomatically extracting data and transforming it into a form suitable for data processing applications. The semi-structured form of web pages, coupled with the availability of business-relevant data, has led to the availability of several established products on the market for wrapping data from the Web. One such approach is the Lixto me...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2005